Homography has an essential relationship with the special linear group and its embedded Lie algebra structure. Although the Lie algebra representation is elegant, few researchers have established the connection between homography estimation and this algebraic expression. In this paper, we propose Warped Convolution Networks (WCN) to effectively estimate the homography through the SL(3) group and sl(3) algebra with group convolution. To this end, six commutative subgroups within the SL(3) group are composed to form a homography transformation. For each subgroup, a warping function is proposed to bridge the Lie algebra structure to its corresponding parameters in the homography. By taking advantage of warped convolution, homography estimation is formulated as several simple pseudo-translation regressions. By walking along the Lie topology, our proposed WCN is able to learn features that are invariant to homography transformations. It can easily be plugged into other CNN-based methods. Extensive experiments on the POT benchmark and the MNIST-Proj dataset show that our proposed method is effective for both homography estimation and classification.
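As a rough illustration of the algebra this abstract leans on (not the paper's exact parameterization), the sketch below composes a homography from one-parameter subgroups of SL(3) via the matrix exponential; the generator basis and parameter values are illustrative assumptions.

```python
# Minimal sketch: a homography composed from one-parameter subgroups of
# SL(3), each generated by an element of the sl(3) Lie algebra. The
# generator basis below is a common choice, assumed for illustration.
import numpy as np
from scipy.linalg import expm

GENERATORS = {
    "tx":      np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]], float),  # x-translation
    "ty":      np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]], float),  # y-translation
    "rot":     np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float), # rotation
    "scale":   np.array([[1, 0, 0], [0, 1, 0], [0, 0, -2]], float), # isotropic scale
    "shear":   np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]], float),  # shear/stretch
    "persp_x": np.array([[0, 0, 0], [0, 0, 0], [1, 0, 0]], float),  # perspective x
    "persp_y": np.array([[0, 0, 0], [0, 0, 0], [0, 1, 0]], float),  # perspective y
}

def homography_from_algebra(params):
    """Compose exponential maps of sl(3) generators into one homography."""
    H = np.eye(3)
    for name, value in params.items():
        H = H @ expm(value * GENERATORS[name])
    return H

H = homography_from_algebra({"tx": 0.1, "rot": 0.05, "persp_x": 1e-3})
assert abs(np.linalg.det(H) - 1.0) < 1e-6  # every generator is traceless
```

Because every generator is traceless, each factor (and hence the composition) has unit determinant, which is what keeps the result inside SL(3).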
With single-pixel detection, an end-to-end neural network that jointly optimizes both encoding and decoding enables high-precision imaging and high-level semantic sensing. However, for varied sampling rates, such a large-scale network requires retraining, which is laborious and computation-consuming. In this letter, we report a weighted optimization technique for dynamic rate-adaptive single-pixel imaging and sensing, which only needs to train the network once to be usable at any sampling rate. Specifically, we introduce a novel weighting scheme in the encoding process to characterize the modulation efficiency of different patterns. While the network is trained at a high sampling rate, the modulation patterns and corresponding weights are updated iteratively, producing an optimally ranked encoding series at convergence. In the experimental implementation, the optimal pattern series with the highest weights is employed for light modulation, thus achieving highly efficient imaging and sensing. The reported strategy saves the additional training of another low-rate network required by existing dynamic single-pixel networks, which further doubles training efficiency. Experiments on the MNIST dataset validate that, once the network is trained at a sampling rate of 1, the average imaging PSNR reaches 23.50 dB at a 0.1 sampling rate, and the image-free classification accuracy reaches up to 95.00% at a sampling rate of 0.03 and 97.91% at a sampling rate of 0.1.
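A minimal PyTorch sketch of the weighting idea described above; the module, shapes, and top-k selection rule are illustrative assumptions rather than the authors' implementation.

```python
# Sketch: learn modulation patterns plus a per-pattern importance weight;
# at a lower sampling rate, keep only the top-k weighted patterns.
import torch
import torch.nn as nn

class WeightedEncoder(nn.Module):
    def __init__(self, num_patterns=1024, pixels=32 * 32):
        super().__init__()
        self.patterns = nn.Parameter(torch.randn(num_patterns, pixels))
        self.weights = nn.Parameter(torch.ones(num_patterns))  # modulation efficiency

    def forward(self, image, k=None):
        # Single-pixel measurements: one scalar per pattern, scaled by its weight.
        m = image.flatten(1) @ self.patterns.t() * self.weights
        if k is not None:  # rate-adaptive sensing: keep only the top-k patterns
            top = torch.topk(self.weights.abs(), k).indices
            mask = torch.zeros_like(self.weights)
            mask[top] = 1.0
            m = m * mask
        return m

enc = WeightedEncoder()
x = torch.rand(8, 32 * 32)   # batch of flattened images
full = enc(x)                # train-time: sampling rate 1
low = enc(x, k=102)          # test-time: roughly 0.1 sampling rate
```

At test time, lowering k trades measurements for speed without retraining, which is the rate-adaptive behavior the letter describes.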
Planar object tracking plays an important role in AI applications such as robotics, visual servoing, and visual SLAM. Although previous planar trackers work well in most scenarios, it remains a challenging task due to rapid motion and large transformations between two consecutive frames. The essential reason behind this problem is that the condition number of such a nonlinear system changes unstably as the search range of the homography parameter space grows. To this end, we propose a novel Homography Decomposition Networks (HDN) approach that drastically reduces and stabilizes the condition number by decomposing the homography transformation into two groups. Specifically, a similarity transformation estimator is designed to robustly predict the first group with a deep convolutional equivariant network. By taking advantage of the high-confidence scale and rotation estimates, the residual transformation is estimated by a simple regression model. Moreover, the proposed end-to-end network is trained in a semi-supervised fashion. Extensive experiments show that our proposed approach outperforms state-of-the-art planar tracking methods by a large margin on the challenging POT, UCSB, and POIC datasets.
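To make the decomposition concrete, here is a toy NumPy sketch: once the first group is estimated as a similarity transform S, what remains is the residual H_r = S^{-1} H. The grouping and the example numbers are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def similarity(scale, theta, tx, ty):
    """Similarity transform: scale, rotation, and translation."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty],
                     [0.0,        0.0,       1.0]])

# A full homography with a small perspective component.
H = np.array([[1.10, 0.02,  5.0],
              [0.01, 0.90, -3.0],
              [1e-4, 2e-4,  1.0]])

S = similarity(scale=1.0, theta=0.015, tx=5.0, ty=-3.0)  # first group (estimated)
H_r = np.linalg.inv(S) @ H                               # second group: residual

print(np.linalg.cond(H), np.linalg.cond(H_r))  # compare conditioning of the two problems
```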
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
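As a purely hypothetical illustration of what a formulaic knowledge bank might look like in practice (the entries, placeholder syntax, and helper below are invented, not ReGrouP's actual interface), the idea is to map domain terms onto reusable SQL expressions that a parser can substitute:

```python
# Hypothetical formulaic knowledge bank: domain term -> SQL expression.
FORMULA_BANK = {
    "gross margin": "(revenue - cost) / revenue",
    "year-over-year growth": "(cur.value - prev.value) / prev.value",
}

def apply_formulaic_knowledge(question: str, sql_template: str) -> str:
    """Replace a term's placeholder with its formula when the term appears."""
    for term, formula in FORMULA_BANK.items():
        if term in question.lower():
            sql_template = sql_template.replace(f"[{term}]", formula)
    return sql_template

sql = apply_formulaic_knowledge(
    "Which products had a gross margin above 40%?",
    "SELECT product FROM sales WHERE [gross margin] > 0.4",
)
# -> SELECT product FROM sales WHERE (revenue - cost) / revenue > 0.4
```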
The node-place model has been widely used to classify and evaluate transit stations; it sheds light on individual travel behaviors and supports urban planning by effectively integrating land use and transportation development. This article adapts the model to investigate whether and how node, place, and mobility are associated with the transmission risks and presence of local COVID-19 cases in a city. To our knowledge, similar studies of the model and its relevance to COVID-19 have not been undertaken before. Moreover, a unique metric drawn from the detailed visit histories of the infected, i.e., the COVID-19 footprint, is proposed and exploited. The study then empirically uses the adapted model to examine the station-level factors affecting local COVID-19 footprints. The model accounts for traditional measures of node and place as well as the actual human mobility patterns associated with them. It finds that stations with high node, place, and human mobility indices normally have more COVID-19 footprints in their proximity. A multivariate regression is fitted to assess whether and to what degree the different indices and indicators predict COVID-19 footprints. The results indicate that many of the place, node, and human mobility indicators significantly affect the concentration of COVID-19 footprints. These findings can help policy-makers predict and monitor hotspots of COVID-19 and other pandemic transmission.
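A hedged sketch of the analysis style the abstract describes: regressing station-level COVID-19 footprint counts on node, place, and mobility indices. The column names and numbers are placeholders, not the study's data.

```python
import pandas as pd
import statsmodels.api as sm

# Placeholder station-level indices and footprint counts.
stations = pd.DataFrame({
    "node_index":     [0.8, 0.4, 0.6, 0.9, 0.3, 0.5, 0.7, 0.2],
    "place_index":    [0.7, 0.5, 0.2, 0.8, 0.4, 0.6, 0.3, 0.1],
    "mobility_index": [0.9, 0.3, 0.5, 0.7, 0.2, 0.6, 0.8, 0.1],
    "footprints":     [42, 11, 18, 39, 7, 21, 33, 4],
})

X = sm.add_constant(stations[["node_index", "place_index", "mobility_index"]])
model = sm.OLS(stations["footprints"], X).fit()
print(model.params)  # each coefficient reflects an index's association with footprints
```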
Through a study of multi-gas mixture datasets, we show that in multi-component spectral analysis, the number of functional or non-functional principal components required to retain the essential information equals the number of independent constituents in the mixture set. Owing to the mutual independence of different gas molecules, a near one-to-one projection from principal components to mixture constituents can be established, leading to a significant simplification of spectral quantification. Further, with knowledge of the molar extinction coefficients of each constituent, a complete principal component set can be extracted from the coefficients directly, and few or no training samples are required for the learning model. Compared to other approaches, the proposed methods provide fast and accurate spectral quantification with a small memory footprint.
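The core observation lends itself to a short worked example. Below is a minimal NumPy sketch, with synthetic values, of how known molar extinction coefficients turn quantification into a direct projection; the matrix sizes and noise level are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
wavelengths = 200
E = rng.random((wavelengths, 3))          # extinction coefficients: 3 gases

c_true = np.array([0.2, 0.5, 0.3])        # true concentrations
spectrum = E @ c_true + 1e-3 * rng.standard_normal(wavelengths)

# The principal subspace of the mixture set equals span(E): one component
# per independent constituent, so quantification reduces to a projection
# (computed here via least squares) with no training samples needed.
c_est, *_ = np.linalg.lstsq(E, spectrum, rcond=None)
print(np.round(c_est, 3))                 # close to [0.2, 0.5, 0.3]
```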
Text-to-SQL semantic parsing is an important NLP task, which greatly facilitates the interaction between users and the database and becomes the key component in many human-computer interaction systems. Much recent progress in text-to-SQL has been driven by large-scale datasets, but most of them are centered on English. In this work, we present MultiSpider, the largest multilingual text-to-SQL dataset which covers seven languages (English, German, French, Spanish, Japanese, Chinese, and Vietnamese). Upon MultiSpider, we further identify the lexical and structural challenges of text-to-SQL (caused by specific language properties and dialect sayings) and their intensity across different languages. Experimental results under three typical settings (zero-shot, monolingual and multilingual) reveal a 6.1% absolute drop in accuracy in non-English languages. Qualitative and quantitative analyses are conducted to understand the reason for the performance drop of each language. Besides the dataset, we also propose a simple schema augmentation framework SAVe (Schema-Augmentation-with-Verification), which significantly boosts the overall performance by about 1.8% and closes the 29.5% performance gap across languages.
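As a purely hypothetical sketch of schema augmentation with verification (the functions translate() and verify() are stand-ins, not SAVe's actual API), the idea can be as simple as proposing aliases for schema items and keeping only those a verifier accepts:

```python
# Hypothetical sketch: enrich non-English schema columns with candidate
# English aliases, keeping an alias only if a verifier accepts it.
def augment_schema(columns, translate, verify):
    augmented = {}
    for col in columns:
        candidates = translate(col)  # e.g. MT output or a bilingual lexicon
        augmented[col] = [c for c in candidates if verify(col, c)]
    return augmented

schema = augment_schema(
    ["姓名", "年龄"],
    translate=lambda col: {"姓名": ["name"], "年龄": ["age"]}[col],
    verify=lambda col, cand: len(cand) > 0,   # stand-in verification rule
)
# -> {"姓名": ["name"], "年龄": ["age"]}
```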
Practical applications employing deep learning must guarantee inference quality. However, we found that the inference quality of state-of-the-art and state-of-the-practice models in practical applications follows a long-tailed distribution. In the real world, many tasks have strict requirements for the quality of deep learning inference, such as safety-critical and mission-critical tasks. Fluctuations in inference quality seriously affect practical deployment, and quality at the tail may lead to severe consequences. Models with outstanding inference quality that were designed and trained under loose constraints can still exhibit poor inference quality under constraints of practical significance. On the one hand, neural network models must be deployed on complex systems with limited resources. On the other hand, safety-critical and mission-critical tasks must meet additional metric constraints while ensuring high inference quality. We coin a new term, ``tail quality,'' to characterize this essential requirement and challenge. We also propose a new metric, ``X-Critical-Quality,'' to measure inference quality under certain constraints. This article reveals factors contributing to the failure of state-of-the-art and state-of-the-practice algorithms and systems in real scenarios, and we call for innovative methodologies and tools to tackle this enormous challenge.
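A hedged sketch of how tail quality could be measured in practice: evaluate a model many times under realistic constraints and report quality at the tail of the distribution rather than only its mean. The percentile choice and data below are assumptions, not the article's X-Critical-Quality definition.

```python
import numpy as np

def tail_quality(quality_scores, tail_pct=1.0):
    """Quality at the worst `tail_pct` percent of runs (lower tail)."""
    scores = np.asarray(quality_scores, float)
    return float(np.percentile(scores, tail_pct))

# Placeholder per-run accuracy under deployment-like conditions.
runs = np.random.default_rng(1).normal(0.92, 0.03, size=1000)
print(f"mean={runs.mean():.3f}  tail(1%)={tail_quality(runs):.3f}")
```

The gap between the mean and the tail value is exactly the fluctuation the article argues matters for safety-critical and mission-critical tasks.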
Previous computation models either represent all computations with equivalent expressive power but fail to provide primitive operators for programming complex algorithms, or lack the generalized expression ability to represent newly added computations. This article presents a unified computation model with generalized expression ability and a concise set of primitive operators for programming high-level algorithms. We propose a unified data abstraction -- Tensor of List -- and offer a unified computation model based on it, which we call the ToL model (in short, ToL). ToL introduces five atomic computations that can represent any elementary computation by finite composition, ensured with strict formal proof. Based on ToL, we design a pure-functional language -- ToLang -- which provides a concise set of primitive operators for programming complex big-data and AI algorithms. Our evaluations show that ToL has generalized expression ability and a built-in performance indicator, born with a strictly defined computation metric -- elementary operation count (EOPs) -- consistent with FLOPs within a small error range.
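As a small worked example of an elementary-operation count (the counting rule below is the standard 2mnk estimate for a matrix product, assumed here rather than taken from ToL's formal definition):

```python
def matmul_eops(m, n, k):
    """EOPs of an (m x k) by (k x n) matrix product: k multiplies plus
    roughly k adds per output element, i.e. ~2*m*n*k operations."""
    return 2 * m * n * k

# A 128x256 by 256x64 product:
print(matmul_eops(128, 64, 256))  # 4,194,304 elementary operations
```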
Medical Visual Question Answering (Medical-VQA) aims to answer clinical questions about radiology images, assisting doctors with decision-making. Nevertheless, current Medical-VQA models learn cross-modal representations by placing the vision and text encoders in two separate spaces, which leads to indirect semantic alignment. In this paper, we propose UnICLAM, a Unified and Interpretable Medical-VQA model based on Contrastive Representation Learning with Adversarial Masking. Specifically, to learn an aligned image-text representation, we first establish a unified dual-stream pre-training structure with a gradually soft-parameter sharing strategy. Technically, the proposed strategy constrains the vision and text encoders to be close in a shared space, with the constraint gradually loosened in higher layers. Moreover, to grasp the semantic representation, we extend adversarial masking data augmentation to the contrastive representation learning of vision and text in a unified manner, alleviating the meaninglessness of the commonly used random masking. Concretely, while the encoder training minimizes the distance between the original and masked features, the adversarial masking model is trained adversarially to maximize that distance. Furthermore, we explore the unified adversarial masking strategy further, showing that it improves potential ante-hoc interpretability with remarkable performance and efficiency. Experimental results on the VQA-RAD and SLAKE public benchmarks demonstrate that UnICLAM outperforms 11 existing state-of-the-art Medical-VQA models. More importantly, we additionally discuss the performance of UnICLAM in diagnosing heart failure, verifying that it exhibits superior few-shot adaptation performance in practical disease diagnosis.
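A rough PyTorch sketch of gradually soft-parameter sharing as described: corresponding vision/text layers are pulled together by a penalty whose strength decays with depth. The geometric decay schedule and layer pairing are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def soft_sharing_loss(vision_layers, text_layers, base=1.0, decay=0.5):
    """Penalize distance between paired encoder layers, loosening with depth."""
    loss = 0.0
    for depth, (v, t) in enumerate(zip(vision_layers, text_layers)):
        coeff = base * (decay ** depth)  # weaker coupling at higher layers
        for pv, pt in zip(v.parameters(), t.parameters()):
            loss = loss + coeff * (pv - pt).pow(2).sum()
    return loss

vision = [torch.nn.Linear(64, 64) for _ in range(4)]
text = [torch.nn.Linear(64, 64) for _ in range(4)]
print(soft_sharing_loss(vision, text))  # added to the training objective
```

This keeps the two streams near one shared space at the bottom while letting the modality-specific top layers diverge, which is the gradual loosening the abstract describes.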